Cross-linguistic Data-driven Measurement of Child Language Development
Author
Abstract
Researchers studying child language acquisition have developed a number of metrics for measuring language development. All of these metrics take a transcript of a child's utterances and produce a score that corresponds to language development. The most commonly used metrics rely on surface-level features, chosen for their computational ease and availability. An alternative approach is to take a transcript and predict the age of the child recorded in it. With this approach, a child's linguistic development can be tracked in terms of age, a number that is easily understood without intimate knowledge of the process. This age prediction task has been approached from a machine learning perspective, in which features are extracted from transcripts and used to predict age [SS12, LS14]. That research has shown that simple features correlate well with age; however, it has only been carried out for English-speaking children. In this paper, we extend the age prediction approach to children speaking other languages. Using transcripts of children speaking Spanish, Japanese, and Hebrew from the CHILDES database, we examine the age prediction task using simple syntactic feature templates similar to those used in previous research on English. We find that approaches using only syntactic features perform at levels comparable to those using more language-specific, content-based features. Our best results predict a child's age from syntactic features of transcripts to within two months of the actual age, with strong correlations between predicted and actual age across the data set. Additionally, we compare performance against content-based features and find no significant improvement over syntactic features. We suggest future experiments to determine the best feature sets for a given language, and call for increased data collection in this area.
With the increased availability of child language transcripts, such a cross-linguistic data-driven approach has the ability to influence, motivate, and assist the research area of child language development.
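The age prediction task described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it assumes a single surface-level feature (mean length of utterance in whitespace-separated tokens) in place of the syntactic feature templates, fits a one-variable least-squares line, and reports the two kinds of figures the abstract mentions, error in months and correlation between predicted and actual age. The transcripts in `TOY_TRANSCRIPTS` are invented for illustration and are not CHILDES data; all function names are hypothetical.

```python
from math import sqrt


def mean_length_of_utterance(utterances):
    """Average number of whitespace-separated tokens per utterance."""
    lengths = [len(u.split()) for u in utterances]
    return sum(lengths) / len(lengths)


def fit_line(xs, ys):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def pearson_r(a, b):
    """Pearson correlation between two equal-length sequences."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    var_a = sum((x - ma) ** 2 for x in a)
    var_b = sum((y - mb) ** 2 for y in b)
    return cov / sqrt(var_a * var_b)


def mean_absolute_error(predicted, actual):
    """Average absolute difference, here in months."""
    return sum(abs(p - t) for p, t in zip(predicted, actual)) / len(actual)


# Invented toy data: (age in months, utterances). Not real transcripts.
TOY_TRANSCRIPTS = [
    (18, ["ball", "more juice", "mama"]),
    (24, ["want ball", "more juice please", "doggie go"]),
    (30, ["I want ball", "doggie go outside now", "where mama go"]),
    (36, ["I want the red ball", "the doggie is going outside", "where did mama go"]),
]

if __name__ == "__main__":
    ages = [age for age, _ in TOY_TRANSCRIPTS]
    features = [mean_length_of_utterance(utts) for _, utts in TOY_TRANSCRIPTS]
    slope, intercept = fit_line(features, ages)
    predicted = [slope * f + intercept for f in features]
    # In-sample evaluation only; a real study would hold out test transcripts.
    print("MAE (months):", mean_absolute_error(predicted, ages))
    print("Pearson r:", pearson_r(predicted, ages))
```

Because the sketch evaluates on the same toy transcripts it was fitted to, its numbers say nothing about real performance; a realistic setup would use held-out transcripts and richer, language-appropriate features.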
Similar resources
The Effects of Task Complexity on Input-Driven Uptake of Salient Linguistic Features
The present study investigated the effects of cognitive complexity of pedagogical tasks on the learners’ uptake of salient features in the input. For the purpose of data collection, three versions of a decision-making task (simple, mid, and complex) were employed. Three intact classes (each with 20 language learners) were randomly assigned to three groups. Each group transacted a version of a decis...
Language development and acquisition in children
Language acquisition is a natural developmental process, unique to Homo sapiens, in which a child acquires his or her mother tongue as a first language. The simplest theory of language development is that children learn language by imitating adult language. A second possibility is that children acquire language through conditioning. Noam Chomsky put forward the innateness hypothesis. Piaget ...
The Influence of Data-Driven Exercises Through Using a Computer Program on Vocabulary Improvement in an EFL Context
The present study was conducted to evaluate data driven learning (DDL) combined with Computer Assisted Language Learning (CALL) as an approach to improving vocabulary knowledge of Iranian postgraduates majoring in teaching English, English literature and translation. The purpose was to help language learners get familiar with DDL as a student-centered method taking advantage of a computer progr...
Attitudes towards English Language Norms in the Expanding Circle: Development and Validation of a new Model and Questionnaire
This paper describes the development and validation of a new model and questionnaire to measure Iranian English as a foreign language learners’ attitudes towards the use of native versus non-native English language norms. Based on a comprehensive review of the related literature and interviews with domain experts, five factors were identified. A draft version of a questionnaire based on those f...
Publication date: 2015